RASE: recognition of alternatively spliced exons in C. elegans
Abstract
MOTIVATION: Eukaryotic pre-mRNAs are spliced to form mature mRNA. Alternative splicing of pre-mRNA greatly increases the complexity of gene expression. Estimates show that more than half of human genes and at least one-third of the genes of less complex organisms, such as nematodes or flies, are alternatively spliced. In this work, we consider one major form of alternative splicing, namely the exclusion of exons from the transcript. It has been shown that alternatively spliced exons have certain properties that distinguish them from constitutively spliced exons. Although most recent computational studies on alternative splicing apply only to exons that are conserved between two species, our method uses only information that is available to the splicing machinery, i.e. the DNA sequence itself. We employ advanced machine learning techniques to answer the following two questions: (1) Is a given exon alternatively spliced? (2) How can we identify as yet unidentified exons within known introns?
RESULTS: We designed a support vector machine (SVM) kernel well suited to classifying sequences with motifs that have positional preferences. To solve task (1), we combine the kernel with additional local sequence information, such as the lengths of the exon and the flanking introns. The resulting SVM-based classifier achieves a true positive rate of 48.5% at a false positive rate of 1%. By scanning over single-EST-confirmed exons, we identified 215 potential alternatively spliced exons. For 10 randomly selected exons among these, we successfully performed biological verification experiments and confirmed three novel alternatively spliced exons. To answer question (2), we additionally used SVM-based predictions to recognize acceptor and donor splice sites. Combined with the above-mentioned features, we were able to identify 85.2% of skipped exons within known introns at a false positive rate of 1%.
AVAILABILITY: Datasets, model selection results, our predictions and additional experimental results are available at http://www.fml.tuebingen.mpg.de/~raetsch/RASE
SUPPLEMENTARY INFORMATION: http://www.fml.tuebingen.mpg.de/raetsch/RASE
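The abstract reports performance as a true positive rate at a fixed false positive rate of 1%. The sketch below shows one common way such a figure can be computed from classifier scores; it is an illustration only and not the authors' evaluation code. The function name `tpr_at_fpr` and the toy scores and labels are hypothetical, chosen purely for demonstration.

```python
def tpr_at_fpr(scores, labels, max_fpr=0.01):
    """Highest true positive rate reachable with false positive rate <= max_fpr.

    scores: classifier outputs; larger means "more likely alternatively spliced"
    labels: 1 for positive (alternatively spliced), 0 for negative (constitutive)
    Both the name and the interface are assumptions made for this sketch.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Sort by decreasing score and lower the decision threshold step by step.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(ranked):
        # Examples with identical scores are accepted or rejected together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / n_neg > max_fpr:
            break  # this threshold already exceeds the allowed FPR
        best_tpr = tp / n_pos
        i = j
    return best_tpr


if __name__ == "__main__":
    # Toy data (hypothetical): six exons with made-up scores.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
    toy_labels = [1, 1, 0, 1, 0, 0]
    print(tpr_at_fpr(toy_scores, toy_labels, max_fpr=0.01))
```

Fixing the false positive rate at a low value such as 1% makes sense when constitutive exons far outnumber alternatively spliced ones, because even a small false positive rate can then produce many spurious candidates for experimental verification.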
Similar resources
Searching for Regulatory Elements of Alternative Splicing Events Using Phylogenetic Footprinting
We consider the problem of finding candidates for regulatory elements of alternative splicing events from orthologous genes, using phylogenetic footprinting. The problem is formulated as follows: We are given orthologous sequences P_1, ..., P_a and N_1, ..., N_b from a + b different species, and a phylogenetic tree relating these species. Assume that for i = 1, ..., a, P_i is known to have s...
A sequence compilation and comparison of exons that are alternatively spliced in neurons.
Alternative splicing is an important regulatory mechanism to create protein diversity. In order to elucidate possible regulatory elements common to neuron specific exons, we created and statistically analysed a database of exons that are alternatively spliced in neurons. The splice site comparison of alternatively and constitutively spliced exons reveals that some, but not all alternatively spl...
Splicing of internal large exons is defined by novel cis-acting sequence elements
Human internal exons have an average size of 147 nt, and most are <300 nt. This small size is thought to facilitate exon definition. A small number of large internal exons have been identified and shown to be alternatively spliced. We identified 1115 internal exons >1000 nt in the human genome; these were found in 5% of all protein-coding genes, and most were expressed and translated. Surprisin...
Alu-containing exons are alternatively spliced.
Alu repetitive elements are found in approximately 1.4 million copies in the human genome, comprising more than one-tenth of it. Numerous studies describe exonizations of Alu elements, that is, splicing-mediated insertions of parts of Alu sequences into mature mRNAs. To study the connection between the exonization of Alu elements and alternative splicing, we used a database of ESTs and cDNAs al...
Comparative Component Analysis of Exons with Different Splicing Frequencies
Transcriptional isoforms are not just random combinations of exons. What causes exons to be differentially spliced, and are exons with different splicing frequencies subject to divergent regulation by potential elements or splicing signals? Beyond the conventional classification for alternatively spliced exons (ASEs) and constitutively spliced exons (CSEs), we have classified exons...
Journal title:
- Bioinformatics
Volume 21 Suppl 1
Publication date: 2005